Clustering Algorithms for Content-Based Publication-Subscription Systems
Authors
Abstract
We consider efficient communication schemes based on both network-supported and application-level multicast techniques for content-based publication-subscription systems. We show that communication costs depend heavily on the network configuration and on the distribution of publications and subscriptions. We devise new algorithms and adapt existing partitional data clustering algorithms to determine multicast groups with as much commonality as possible, based on the totality of subscribers' interests. These algorithms perform well when subscriptions are highly heterogeneous, and they scale well. An efficiency of 60% to 80% with respect to the ideal solution can be achieved with a small number of multicast groups (fewer than 100 in our experiments). Some of the same concepts can be applied to match publications to subscribers in real time, and to determine dynamically whether to unicast, multicast, or broadcast information about events over the network to the matched subscribers. We demonstrate the quality of our algorithms via simulation experiments.
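To make the idea of grouping subscribers by commonality of interest concrete, the following is a minimal sketch and not the algorithms of the paper. It assumes that each subscription has been reduced to a 0/1 vector over a hypothetical set of event-space cells, and it uses plain k-means as one representative partitional clustering method. The function names `cluster_subscribers` and `wasted_deliveries` are illustrative only.

```python
from typing import List, Sequence


def cluster_subscribers(interests: Sequence[Sequence[int]], k: int,
                        iterations: int = 20) -> List[int]:
    """Assign each subscriber to one of k multicast groups with k-means.

    Assumption: interests[i][c] is 1 if subscriber i wants events that fall
    in cell c of a discretized event space, else 0.
    """
    n = len(interests)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of subscribers")
    # Deterministic seeding: the first k subscribers are the initial centroids.
    centroids = [[float(x) for x in interests[i]] for i in range(k)]
    assignment = [-1] * n
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_assignment = [
            min(range(k),
                key=lambda g: sum((x - c) ** 2
                                  for x, c in zip(vec, centroids[g])))
            for vec in interests
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for g in range(k):
            members = [interests[i] for i in range(n) if assignment[i] == g]
            if members:
                centroids[g] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment


def wasted_deliveries(interests: Sequence[Sequence[int]],
                      assignment: Sequence[int], k: int) -> int:
    """Count (subscriber, cell) pairs that a group-wide multicast would
    deliver although the subscriber did not ask for that cell."""
    waste = 0
    for g in range(k):
        members = [interests[i] for i in range(len(interests))
                   if assignment[i] == g]
        if not members:
            continue
        # A group must carry every cell that any of its members wants.
        union = [max(col) for col in zip(*members)]
        for vec in members:
            waste += sum(u - v for u, v in zip(union, vec))
    return waste


if __name__ == "__main__":
    demo = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 1, 1]]
    groups = cluster_subscribers(demo, k=2)
    print(groups, wasted_deliveries(demo, groups, k=2))
```

In this sketch, fewer wasted deliveries means the groups share more common interest. A real system would also weigh network topology and publication frequency, which this toy cost ignores.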
Similar resources
A partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
Adaptive Content-based Routing using Subscription Subgrouping in Structured Overlays
Cyclic or general overlays may provide multiple paths between publishers and subscribers. However, an advertisement tree and a matching subscription activates only one path for notifications routing in publish/subscribe systems. This poses serious challenges in handling network conditions like congestion, and link or broker failures. Further, content-based dynamic routing of notifications requi...
Min-Cost Matchmaker Problem in Distributed Publish/Subscribe Infrastructures
The publish/subscribe (pub/sub) paradigm provides content-oriented data dissemination, where communication channels between content providers and content consumers are set up on the basis of interest matches between content provided by the publishers and content requested by the subscribers. In this paper, we study a distributed matchmaker system which resides on the data dissemination path, in...
Efficient Subscription Management in Content-based Networks
Content-based publish/subscribe systems offer a convenient abstraction for data producers and consumers, as most of the complexity related to addressing and routing is encapsulated within the network infrastructure. A major challenge of content-based networks is their ability to efficiently cope with changes in consumer membership. In our XNET XML content network, we have addressed this issue by...